Fast robust inverse transform speaker adapted training using diagonal transformations

نویسندگان

  • Hubert Jin
  • Spyridon Matsoukas
  • Richard M. Schwartz
  • Francis Kubala
چکیده

We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in lower error rate than the previous methods. The method, called Inverse Transform SAT (ITSAT) is based on removing the differences between speakers before training, rather than modeling the differences during training. We develop several methods to avoid the problems associated with inverting the transformation. In one method, we interpolate the transformation matrix with an identity or diagonal transformation. We also apply constraints to the matrix to avoid estimation problems. Finally, we show that the resulting method is much faster, requires much less disk space, and results in higher accuracy than the original SAT method.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Fast Robust Inverse Transform SAT and Multi-stage Adaptation

We present a new method of Speaker Adapted Training (SAT) that is more robust, faster, and results in lower error rate than the previous methods. The method, called Inverse Transform SAT (ITSAT) is based on removing the di erences between speakers before training, rather than modeling the di erences during training. We develop several methods to avoid the problems associated with inverting the ...

متن کامل

Maximum Likelihood Lineartransformations for Hmm

This paper examines the application of linear transformations for speaker and environmental adaptation in an HMM-based speech recognition system. In particular, transformations that are trained in a maximum likelihood sense on adaptation data are investigated. Other than in the form of a simple bias, strict linear feature-space transformations are inappropriate in this case. Hence, only model-b...

متن کامل

Variance compensation within the MLLR framework for robust speech recognition and speaker adaptation

This paper investigates the use of maximum likelihood linear regression (MLLR) for both speaker and environment adaptation. MLLR transforms the mean and variance parameters of a set of HMMs. In this paper a number of different types of linear transformations of the variances are examined including full, block diagonal, and diagonal transformation matrices. Experiments on large vocabulary speake...

متن کامل

Training Robust Acoustic Models Using Features of Pseudo-Speakers Generated by Inverse CMLLR Transformations

In this paper a novel speech feature generationbased acoustic model training method is proposed. For decades, speaker adaptation methods have been widely used. All existing adaptation methods need adaptation data. However, our proposed method creates speaker-independent acoustic models that cover not only known but also unknown speakers. We do this by adopting inverse maximum likelihood linear ...

متن کامل

Acoustic Model Training Using Pseudo-Speaker Features Generated by MLLR Transformations for Robust Speaker-Independent Speech Recognition

A novel speech feature generation-based acoustic model training method for robust speaker-independent speech recognition is proposed. For decades, speaker adaptation methods have been widely used. All of these adaptation methods need adaptation data. However, our proposed method aims to create speaker-independent acoustic models that cover not only known but also unknown speakers. We achieve th...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 1998